Inferring the spread of COVID-19: the role of time-varying reporting rate in epidemiological modelling

The role of epidemiological models is crucial for informing public health officials during a public health emergency, such as the COVID-19 pandemic. However, traditional epidemiological models fail to capture the time-varying effects of mitigation strategies and do not account for under-reporting of active cases, thus introducing bias in the estimation of model parameters. To infer more accurate parameter estimates and to reduce the uncertainty of these estimates, we extend the SIR and SEIR epidemiological models with two time-varying parameters that capture the transmission rate and the rate at which active cases are reported to health officials. Using two real data sets of COVID-19 cases, we perform Bayesian inference via our SIR and SEIR models with time-varying transmission and reporting rates and via their standard counterparts with constant rates; our approach provides parameter estimates with more realistic interpretation, and 1-week ahead predictions with reduced uncertainty. Furthermore, we find consistent under-reporting in the number of active cases in the data that we consider, suggesting that the initial phase of the pandemic was more widespread than previously reported.

By allowing the transmission rate to vary in time, our model adapts to heterogeneity of disease outcomes, to mutations in the disease, to asymptomatic transmission, and to mitigation strategies used to slow the spread of the disease. However, a time-varying transmission rate does not fully capture the spread of a disease through a population. Indeed, not all cases are reported, for example because of faulty tests, inability to obtain a test, or false negative results. In order to further quantify disease proliferation through a susceptible population, we also allow the reporting rate to vary in time. By including a time-varying reporting rate, our model is more flexible to adapt to underreporting of cases and to advancements in testing reliability. In other words, this time-varying reporting rate allows our model to dynamically adapt to changes in the spread of COVID-19, thus informing more accurately the reporting of positive cases to the health agencies.
During the COVID-19 pandemic, and specifically in the early phases, a lag in testing times and underreported case rates have been observed. Similar studies on the Zika virus note a high occurrence of underreporting and estimate reporting rates by including a separate unreported infected population in epidemiological modelling 7 . Similarly, in estimating the reporting rate for shigellosis, a secondary unreported infected compartment is proposed by Joh et al. 8 . An alternative approach is taken in a study of an influenza pandemic, where a functional form of the reporting rate is estimated as either linearly increasing or constant depending on the time-frame of the pandemic 9,10 . Rather than introducing an additional compartment into an epidemiological model, we instead employ a time-varying parameter of reporting rate. So, we avoid a structural change to our model, while making the model more flexible via time-varying parameterization. Augmenting an epidemiological model is not the only approach to estimate the spread of a disease throughout a population. Alternatively, one could develop an artificial neural network 11 , or a fractional-order epidemiological model 12 .
To demonstrate our approach, we study two regional outbreaks with different mitigation strategies using the COVID-19 case counts, as collated by the New York Times 13 . Considering data from Tennessee and New York showcases the flexibility of our Bayesian model with time-varying transmission and reporting rates, because the spread of COVID-19 in these states presents different modelling challenges. Notably, these two states experienced the initial 6 months of the pandemic in widely differing fashion. Indeed, each state handled the beginning of the pandemic differently, employed different testing strategies, and continue to have different reopening policies 14 . Such widely varying pandemic management strategies are demonstrated by the COVID-19 data sets associated with Tennessee and New York, which exhibit drastically different dynamics of case counts. Our experiments highlight the capacity of our Bayesian state space epidemiological models with time-varying transmission and reporting rates to fit data representing different dynamics of disease spread, to estimate the transmission rate and the reporting rate during the progression of COVID-19, and to reduce the uncertainty in predictions of COVID-19 cases.

Model formulation
S(E)IR epidemiological model. To investigate the spread of COVID-19 through a population, we use the traditional SIR and SEIR epidemiological models, augmenting them with two temporal variables. One of these variables accounts for the rate of transmission, while the other variable quantifies the rate at which positive cases are reported. These time-varying variables account for variation in the transmissibility of a disease and in the rate at which new infections are reported. Consequently, our model is able to account for variation and uncertainty present during a public health emergency.
Allowing the transmission rate of COVID-19 to vary in time, similar to the approach of 15-18 , we are able to capture intervention measures enacted by public health officials, such as mask mandates or shelter-in-place orders, or 'super-spreader' events which have direct impacts on disease prevalence. By modelling the rate at which the disease spreads as a time-varying variable, we may better quantify the spread of the disease, which in turn, yields more accurate information about the course of the pandemic to public health officials concerned with slowing community spread.
Quantifying the transmissibility of a disease does not fully capture its reach and spread, though. If we consider the early phases of the COVID-19 pandemic, wide-spread access to testing was unavailable, and disease prevalence within communities was widely under-reported 19,20 . To this end, we present a novel S(E)IR epidemiological model by introducing two time-varying variables, namely the reporting rate p(t) and the transmission rate β(t) . The reporting rate p(t) captures the percentage of positive cases reported, and the transmission rate β(t) quantifies the rate at which COVID-19 spreads through a population at time t.
The SIR epidemiological model with time-varying transmission rate $\beta(t)$ is represented by the system of differential equations

$$
\begin{aligned}
\frac{dS(t)}{dt} &= -\beta(t)\,\frac{S(t)\,I(t)}{N},\\
\frac{dI(t)}{dt} &= \beta(t)\,\frac{S(t)\,I(t)}{N} - \gamma I(t),\\
\frac{dR(t)}{dt} &= \gamma I(t),\\
dw(t) &= \mu(w(t), \theta_w)\,dt + \sigma(w(t), \theta_w)\,dB(t),\\
w(t) &= g(\beta(t)),
\end{aligned}
\tag{1}
$$

where $w(t)$ solves a stochastic differential equation (SDE), which is driven by the Wiener process $B(t)$ and controls the transmission rate 21 . In this SDE, we define drift $\mu$ and diffusion $\sigma$ terms, parameterized by $\theta_w$, and a function $g : \mathbb{R}^+ \to \mathbb{R}$. It is this time-varying parameter that controls the extent to which a disease spreads throughout a population. Lastly, to fully specify the epidemiological model, we write $S(t)$ as individuals susceptible to COVID-19, $I(t)$ as individuals infected with COVID-19, and $R(t)$ as individuals removed from the population who are no longer able to become infected. Susceptible individuals move into the infected compartment at rate $\beta(t)$ and infected individuals become removed at rate $\gamma$.
An expansion on the SIR model is the SEIR model, which includes an additional 'exposed' compartment $E(t)$ between the 'susceptible' and 'infected' compartments. Individuals who are in the disease's latent period, the 'exposed' compartment, move into the 'infected' compartment with rate $\kappa$ and are thus capable of infecting other susceptible individuals. The system of differential equations that represents the SEIR model can be found in Appendix A. The salient difference between the SIR and SEIR models is the inclusion of the 'exposed' state in the latter; individuals move into this compartment once they have been exposed to an infected individual, but are not yet expressing any symptoms. Such a latent phase is pertinent when quantifying the breadth of the pandemic due to the challenges in correctly accounting for such individuals. Moreover, the 'exposed' compartment not only acts as a delay between the 'susceptible' and 'infected' compartments, but individuals in this compartment may also transmit the disease before becoming actively infected themselves.
In this study, for both SIR and SEIR models, we take $\mu = 0$ and $\sigma(w(t), \theta_w) = \theta_w$, thus assuming that the transmission rate on any day is likely to be the same as on the previous day, and set $g(\cdot) = \log(\cdot)$. By choosing $\mu = 0$ and $\sigma(w(t), \theta_w) = \theta_w$, the resulting path $w(t)$, defined by the SDE $dw(t) = \theta_w\,dB(t)$, is a Brownian motion. Allowing $w(t)$ to vary in time defines the effective transmission rate $\beta(t) = g^{-1}(w(t)) = \exp(w(t))$, which controls the extent to which a disease spreads between individuals within a population.
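As an illustration of this construction, the following Python sketch simulates a driftless Brownian path for the log-transmission rate and maps it back through the exponential. It is a minimal sketch, not the implementation used in this study: the function name `simulate_transmission_rate`, the unit time step, and the values of the initial rate and of the diffusion parameter are assumptions chosen only for the example.

```python
import math
import random


def simulate_transmission_rate(beta0, theta_w, n_steps, dt=1.0, seed=0):
    """Simulate beta_t = exp(w_t), where dw = theta_w dB (no drift).

    beta0, theta_w, dt and seed are illustrative inputs, not fitted values.
    """
    rng = random.Random(seed)
    w = math.log(beta0)  # g = log, so w_0 = log(beta_0)
    betas = [beta0]
    for _ in range(n_steps):
        # A Brownian increment over dt is Gaussian with variance theta_w^2 * dt.
        w += theta_w * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        betas.append(math.exp(w))  # beta = g^{-1}(w) = exp(w) stays positive
    return betas


# Hypothetical example: 26 steps starting from beta_0 = 0.3.
betas = simulate_transmission_rate(beta0=0.3, theta_w=0.1, n_steps=26)
```

Working on the log scale guarantees a positive transmission rate at every step, while the zero drift encodes the assumption that the rate at one step is centred on its value at the previous step.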

State space model.
Incorporating temporal information about the dynamics driving the spread of COVID-19 into our model, we discretize the path of w(t) which defines the transmissibility, and seek to infer p(t) at each time step. These two correlated time series quantify the rate at which COVID-19 spreads through a population 10,22 . Then coupling these time series with observed case counts as reported by public health agencies, we adopt a state-space modelling paradigm for our inference problem 23 .
A state space model relates two discrete time processes by a probabilistic model incorporating both state evolution and observation densities. In the present context, we are given the number of cases reported by the public health officials, and seek to infer the distribution of the reporting rate $p(t)$ and of the transmission rate $\beta(t)$. We view the transmission rate and reporting rate as discrete time processes and write $\beta_t$ and $p_t$ for the discrete time counterparts of $\beta(t)$ and $p(t)$, respectively. Making some regularity assumptions 24 on the evolution dynamics of the system given in Eq. (1), and writing $c_{i:j} = (c_i, c_{i+1}, \ldots, c_j)$ for $i \le j$ and any sequence $\{c_t\}_{t \ge 0}$, we view the stochastic epidemiological model as a state space model with the following components.
The transmissibility $w_t$ of COVID-19 is controlled by the SDE $dw_t = \theta_w\,dB_t$, with Wiener process $B_t$, parameterized by $\theta_w$. The reporting rate $p_t$ evolves according to the SDE $dp_t = p_t\,dt + \vartheta^2\,dW_t$, where $\vartheta$ is the diffusion parameter and $W_t$ is a Wiener process. The reporting rate $p_t$, coupled with the number of observed cases $Y_t$ and associated parameters $\theta_Y$, defines an observation density, $Y_t \sim h(\cdot \mid X_t, p_t, \theta_Y)$. We define $X_t$ as the latent number of cases, i.e., $X_t = \int_{t-1}^{t} \beta_\tau \frac{S_\tau I_\tau}{N}\,d\tau$ or $X_t = \int_{t-1}^{t} \kappa E_\tau\,d\tau$, $1 \le t \le T$, for an SIR or SEIR model, respectively, and $X_0 \sim \pi_0$ for a prior density $\pi_0$. We write this recursion as $F(w_t; \theta_w)$ to make explicit that the solution to Eq. (1) depends on the state $w_t$ and can be computed for any value thereof via numerical integration.
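To make the latent incidence concrete, the Python sketch below integrates the SIR compartments with a simple Euler scheme and accumulates the new infections over each unit time interval, which is the SIR form of the latent number of cases. It is a hedged illustration only: the function name `weekly_incidence_sir`, the Euler integrator, the number of sub-steps, and every numerical value are assumptions for the example, and the transmission rate is held constant within each interval.

```python
def weekly_incidence_sir(betas, gamma, s0, i0, r0, substeps=70):
    """Return X_t, the new infections accumulated over each unit interval.

    betas holds one transmission rate per interval (held constant inside it);
    gamma is the removal rate in the same time units. Illustrative sketch only.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    h = 1.0 / substeps  # Euler step size within one interval
    incidence = []
    for beta in betas:
        x_t = 0.0
        for _ in range(substeps):
            new_inf = beta * s * i / n * h  # flow from S to I
            new_rem = gamma * i * h         # flow from I to R
            s -= new_inf
            i += new_inf - new_rem
            r += new_rem
            x_t += new_inf                  # integral of beta * S * I / N
        incidence.append(x_t)
    return incidence, (s, i, r)


# Hypothetical inputs: 10 weeks, rates per week, population of one million.
incidence, final_state = weekly_incidence_sir(
    betas=[2.0] * 10, gamma=1.4, s0=999_900, i0=100, r0=0
)
```

Because every flow leaving one compartment enters another, the compartments always sum to the total population, and the accumulated incidence equals the total decrease in the susceptible compartment.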
To define the observation density $h(\cdot \mid X_t, p_t, \theta_Y)$, we first assume that the reports of new cases are independent Bernoulli random variables, i.e., each case is reported with probability $p_t$. Then the waiting time until the first reported case is geometrically distributed with the same parameter $p_t$, and we are interested in the weekly incidence rate, conditional on the number of reported cases $Y_t$. Since $Y_t$ is the sum of i.i.d. geometrically distributed random variables with parameter $p_t$, it then follows that $Y_t \mid X_t \sim \mathrm{NegBin}(p_t, X_t)$. Invoking the central limit theorem, the observations $Y_t$ are approximately Gaussian with mean $p_t X_t$ and variance $p_t(1 - p_t)X_t + (p_t X_t \eta)^2$. In this scenario, the variance term contains an additional parameter $\eta$, which describes the over-dispersion within a population. See the Appendix for a more detailed discussion about the distribution of $Y_t$. Lastly, we define parameters $\theta_Y = (\eta, \vartheta)$ and $\theta_X = (\kappa, \gamma, X_0)$, or $\theta_X = (\gamma, X_0)$, for the SEIR or SIR model, respectively.
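The Gaussian approximation described above can be written as a short log-density function. The sketch below is illustrative: the function name `obs_logpdf` and the example inputs are assumptions, and the function expects a positive latent count and a reporting probability in (0, 1] so that the variance is positive.

```python
import math


def obs_logpdf(y, x, p, eta):
    """Gaussian log-density of observed cases y given latent cases x.

    Mean p * x and variance p * (1 - p) * x + (p * x * eta)^2, as in the text.
    Assumes x > 0, 0 < p <= 1 and eta > 0 so that the variance is positive.
    """
    mean = p * x
    var = p * (1.0 - p) * x + (p * x * eta) ** 2
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


# Hypothetical inputs: 1000 latent cases, half of them reported on average.
example = obs_logpdf(y=480, x=1000, p=0.5, eta=0.1)
```

The first variance term is the binomial-type sampling variance, while the second grows with the square of the expected count, so the over-dispersion parameter dominates when case counts are large.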

Model parameters.
We now describe the parameters utilized in our model. Firstly, the vector $\theta_Y = (\eta, \vartheta)$ contains the parameters used when investigating a temporally-varying reporting rate $p_t$. Here $\eta$ is incorporated in the observation variance, specifying over-dispersion within the observations, and indicating heterogeneity within a population 25,26 . Over-dispersion is a common occurrence in count data 25,26 and accounts for large variances in individual outcomes. Furthermore, it can signal the presence of 'super-spreading' events 27 . Secondly, $\vartheta$ is the standard deviation of the reporting rate. By investigating the marginal density of $p_t$ at each time step, we may quantify the uncertainty in the rate of reporting of positive cases. We compare our model against a model with constant reporting rate, denoted $p_c$, for the entire time duration. In the model with constant reporting rate, we set $\theta_Y = \eta$, and we may infer the variance a posteriori from Markov chain Monte Carlo (MCMC) integration, but this is a static value that is not capable of adapting to changes in the realities of a pandemic, such as more sensitive or accurate testing methods and the ability of individuals to obtain a test.
For the parameters governing the movement between the compartments of the epidemiological model, aside from the transmission rate β t , we follow 28 and assume that the distribution of γ , governing the movement of individuals from the infected bin to recovered, follows a Gaussian distribution with mean 5.058 days and standard deviation of 1.519 days, with support on the interval of 2.228 and 11.800 days. For our SEIR models, we sample

Results
Parameter uncertainty. Two data sets of reported COVID-19 cases, one from New York and one from Tennessee, demonstrate different evolution dynamics; see Fig. 1b,c, respectively, for plots of the weekly new case counts. The pandemic went through a period of sustained exponential growth in New York, primarily in the New York City metropolitan region, then abated to a near constant level in subsequent months. Furthermore, during the initial wave in New York, testing was not widely available, and the 7-day rolling average of the positive test rate peaked at nearly 50% in early April 2020 29 . In contrast, the Tennessee data, which are representative of the evolution dynamics of case counts for the majority of other states, exhibit a slow initial increase, followed by a first wave in April and a much larger increase in July. The variations between the incidence data for these two states are visible in Fig. 1a. These two datasets represent the initial phase of the pandemic from two distinct perspectives. Firstly, the dramatic increase in the New York case counts during March 2020 induces greater uncertainty in the parameters as compared with the gradual increase in the Tennessee case counts. Secondly, the different levels of induced parameter uncertainty enable us to study how such different levels of uncertainty propagate in time through our model.
The estimated transmission rate for New York based on our SIR model with time-varying reporting rate $p_t$ exhibits a spike in transmissibility in April and May (Fig. 2b), which agrees with the spike of observed COVID-19 cases in April and May (blue line in Fig. 1a). The typical SIR model with constant reporting rate produces a transmission rate estimate that fails to capture this spike in transmissibility, as seen in Fig. 2a. Recall that posterior estimates of the transmission rate quantify the rate at which susceptible individuals move from the susceptible compartment to an active stage of infection. As demonstrated by Figs. 1a and 2, our SIR model with time-varying reporting rate provides transmission rate estimates that reflect the observed dynamics of transmission more faithfully than an SIR model with constant reporting rate.
Furthermore, the higher rate of change in the number of observed COVID-19 cases in New York between March and May (blue line in Fig. 1a) induces higher uncertainty in the estimation of the transmission rate. Our SIR model with time-varying reporting rate yields wider credible intervals (Fig. 2b) for the estimated transmission rate in New York between March and May in comparison to the SIR model with constant reporting rate (Fig. 2a). Thus, letting the reporting rate vary with time facilitates the detection of periods of higher uncertainty in transmission rate estimates.
The transmission rate estimates obtained by fitting the SIR model with time-varying and with constant reporting rate to the Tennessee COVID-19 case data agree with one another for the period between May and August (see Fig. 3a,b). However, the SIR model with time-varying reporting rate estimates a smaller drop in the transmission rate in April as compared with the SIR model with constant reporting rate. The former model exhibits a reduced reporting rate estimate in April (blue line in Fig. 4b). So, both models capture the small drop in the number of COVID-19 cases observed in Tennessee during April (orange line in Fig. 1a); the flexible SIR model with time-varying reporting rate attributes this drop to decreased reporting rate (Fig. 4b), whereas the SIR model with constant reporting rate attributes the drop to decreased transmission rate (Fig. 3a). There is no evidence to corroborate which of the two interpretations is correct. Nevertheless, the SIR model with time-varying reporting rate has a wider range of potential options, and it explains the drop in cases via reduced reporting rate in April, which is an explanation not available via the SIR model with constant reporting rate.
Figure 4 presents the estimated reporting rates for the SIR model with time-varying reporting rate (in blue) and with constant reporting rate (in orange). In the case of New York (Fig. 4a), the 75% and 95% credible intervals for the estimated constant reporting rate are wider than the respective credible intervals for the estimated time-varying reporting rate. This indicates that our more flexible SIR model reduces the uncertainty in reporting rate estimation by letting the rate vary with time. The time-varying reporting rate estimates capture an upward trend in reporting rate both in New York (Fig. 4a) and in Tennessee (Fig. 4b), which can be explained by improvements in infrastructure and in available resources to manage the pandemic as time goes by. On the other hand, the SIR model with constant reporting rate cannot accommodate such temporal changes in the management and reporting of the pandemic.
As seen in Fig. 4, SIR modelling with constant reporting rate tends to underestimate reporting rates. The disagreement in reporting rate estimation between the SIR models with time-varying and with constant reporting rate is particularly pronounced in the case of New York (Fig. 4a); notice that the straight orange line (constant reporting rate estimate) is lower than the blue line (time-varying reporting rate estimate). The demonstrated underestimation of reporting rates via SIR modelling with constant reporting rate has been previously noted in the literature, and it has been linked to underestimation of the true number of cases and to bias in transmission rate estimation 30 . Our approach based on SIR modelling with time-varying reporting rate provides a principled

Predictive uncertainty.
We make predictions about the number of cases 1 week into the future by fitting SIR and SEIR models with constant or with time-varying reporting rates to New York and to Tennessee data through August 30, 2020. Irrespective of whether constant or time-varying reporting rate is employed, both SIR and SEIR models produce 75% predictive intervals for the number of cases in New York and in Tennessee that contain the observed number of cases (Fig. 5).
Overall, SIR and SEIR models with time-varying reporting rates outperform their counterparts with constant reporting rates in terms of predictive performance (Fig. 5). Firstly, time-varying reporting rates yield narrower predictive intervals, thus reducing predictive uncertainty. Secondly, time-varying reporting rates lead to predictive posterior means that are closer to the observed number of cases. The improved predictive performance attained via modelling based on time-varying reporting rates is observed in three out of the four examined scenarios (SIR and SEIR models fitted to Tennessee data, and SEIR model fitted to New York data), with no apparent improvement in one case (SIR model fitted to New York data). The wider predictive intervals, and therefore higher predictive uncertainty, in the Tennessee predictions in comparison to the New York predictions shown in Fig. 5 are seemingly counter-intuitive, given the higher volatility of the New York data (Fig. 1a). However, there is an explanation for the higher predictive uncertainty associated with the Tennessee data. The Tennessee Department of Health changed how it defined an active case on September 3, 2020, resulting in a 1-day decrease of approximately 20,000 reported cases 31 . Consequently, the predictive distribution is skewed to the left in comparison to the ground truth value, as anticipated after considering the change in the definition of an active case by the Tennessee Department of Health.

Discussion
The states of New York and Tennessee experienced the first wave of the COVID-19 pandemic in different fashion. Our modelling strategy is able to dynamically adapt to different mitigation strategies enacted in each locality and accurately reflect the course of the pandemic in these geographic regions. We are able to capture the dynamic nature of the transmission rate when intervention methods are enacted, and can quantify changes in the reporting rate of case counts. This modelling strategy yields actionable results for public health officials entrusted with a community's well-being.
We observe dependence between the time-varying parameters, namely between the transmission and reporting rates, similar to the effects noted by 30 . Indeed, with the significant under-reporting of active cases present in the New York data, a model employing a static reporting rate fails to capture the dynamic nature of COVID-19 transmissibility. As a concrete example, consider the time period from March 1, 2020 through May 24, 2020, when there were 383,560 active cases reported in New York. Taking the reporting rate inferred by our SIR model, we find that there were 530,411 active cases, with a 95% confidence interval of (403,285, 900,077), a figure which is corroborated by the study of 32 that identified under-reporting of active COVID-19 cases by considering hospitalization and death rates.
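As a quick check on these figures, the snippet below computes the average reporting rate implied by the reported and inferred totals. This ratio is a back-of-the-envelope summary introduced here for illustration, and the variable names are assumptions; the model itself infers a reporting rate that varies by week rather than a single average.

```python
# Totals quoted in the text for New York, March 1 to May 24, 2020.
reported = 383_560
inferred = 530_411
interval_low, interval_high = 403_285, 900_077

# Average share of inferred cases that were reported (illustrative summary).
implied_rate = reported / inferred
# Dividing by the interval ends gives the matching range for that share.
implied_rate_range = (reported / interval_high, reported / interval_low)
# Cases implied to have gone unreported over the period.
unreported = inferred - reported

print(f"implied average reporting rate: {implied_rate:.3f}")
print(f"implied range: {implied_rate_range[0]:.3f} to {implied_rate_range[1]:.3f}")
print(f"implied unreported cases: {unreported}")
```

Under this reading, roughly 72% of the inferred active cases were reported over the period, leaving about 147,000 cases uncounted.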
Primarily, a novelty in our modelling approach has been to include a time-varying reporting rate that leads to models which are more likely to fit and explain COVID-19 incidence data. This conclusion is intuitive, since changes in the reporting rate imply changes in the resulting data, so a model with a varying reporting rate is more likely to fit data affected by changes in reporting procedures.
Secondly, we provide a Bayesian approach to quantify uncertainty in relevant epidemiological parameters and in predictions, yielding a source of important information to public health officials tasked with assessing the present state and with suggesting mitigation strategies for subsequent weeks. Our 1-week ahead predictions are accurate, in the sense that the relevant 75% credible intervals contain the ground truth (Fig. 5a).
The methods described herein are better able to capture not only the time-varying drivers of an epidemic, but also how the reporting of cases changes temporally, thus providing more accurate quantification of the spread of a disease through a susceptible population. Our method provides near real-time actionable information to public health officials, as opposed to methods that use the hospitalization rate 32 or the excess death rate 33,34 , both of which have a time-lag on the order of weeks. Indeed, previous studies have noted the presence of COVID-19 in February 2020, well before any appreciable increase in hospitalizations 35,36 . Quantifying the spread of a disease through a population and the proportion that are going uncounted by public health agencies is an essential tool for these agencies tasked not only with estimating the proportion of a group that is actively infected, but mitigating the disease's impact on a population. Indeed, by providing real-time knowledge of the true number of active infections to public health officials, the timing and severity of mitigation strategies can be better informed, thus reducing the community spread of a disease.
While our model cannot capture all the intricacies involved with the public health infrastructure, such as variability of testing sensitivity, access to testing sites, or individuals taking at-home tests that are not reported to public health agencies, we are able to estimate time-sensitive parameters crucial to slowing the spread of an emerging new disease. Indeed, by providing accurate and actionable information about the spread of a disease throughout a population, public health officials could put in place mitigation strategies to slow the spread of a disease.
Future versions of the model could incorporate additional parameters, such as one describing mobility of subpopulations within a geographic region. Such a parameter could capture heterogeneity within a population, and identify those subgroups at higher or lower risk for infection and transmission due to their movements within a specified time window. Lastly, we plan to further investigate the correlation structure between the transmission and reporting rates, to better quantify their dependencies and effects on each other.

Methods
Bayesian formulation. Our SIR and SEIR models are parameterized by $\theta = (\theta_w, \theta_X, \theta_Y)$. We factorize the posterior density $\pi(w_{0:T}, p_{0:T}, \theta \mid Y_{1:T})$ of the transmissibility $w_{0:T}$, reporting rate $p_{0:T}$ and model parameters $\theta$, given observations $Y_{1:T}$, as follows:

$$
\pi(w_{0:T}, p_{0:T}, \theta \mid Y_{1:T}) = \pi(w_{0:T}, p_{0:T} \mid Y_{1:T}, \theta)\,\pi(\theta \mid Y_{1:T}). \tag{3}
$$

According to Eq. (3), we sample from $\pi(w_{0:T}, p_{0:T}, \theta \mid Y_{1:T})$ by alternating between sampling from densities $\pi(w_{0:T}, p_{0:T} \mid Y_{1:T}, \theta)$ and $\pi(\theta \mid Y_{1:T})$ via the particle Markov chain Monte Carlo (PMCMC) algorithm of 37 . PMCMC alleviates issues of convergence and insufficient exploration of the sample space that can arise due to correlations and dependencies between variables.
Sampling from Eq. (3) allows us to infer the time-varying transmission rate $\beta_{0:T}$, the time-varying reporting rate $p_{0:T}$, and to make predictions about the future course of the pandemic. Moreover, our Bayesian SIR and SEIR models enable us to quantify the uncertainty of our parameter estimates and of our predictions.

Particle Markov chain Monte Carlo.
To sample from the posterior density of eq. (3), we employ PMCMC sampling 37 . We describe the algorithmic procedure and detail the hyperparameter choices in our model; for an in-depth discussion and theoretical results, see 37,38 . PMCMC alleviates issues with slow MCMC mixing and low acceptance rates that are present in other methodologies for sampling from a joint posterior, such as the pseudomarginal approach of 39 . First, the sequential Monte Carlo (SMC) procedure is described, followed by PMCMC.
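To illustrate how SMC and PMCMC fit together, the Python sketch below runs a particle marginal Metropolis-Hastings sampler, one member of the PMCMC family, on a deliberately simple toy model. It is not the sampler configuration used in this study: the toy model (a Gaussian random walk observed with Gaussian noise), the Exp(1) prior on the diffusion parameter, the function names, and all tuning values are assumptions made for the example.

```python
import math
import random


def particle_loglik(ys, theta_w, obs_sd, n_particles, rng):
    """Bootstrap particle filter estimate of log p(y_1:T | theta_w)."""
    particles = [0.0] * n_particles  # toy choice: every particle starts at w_0 = 0
    loglik = 0.0
    for y in ys:
        # Propagate each particle through the random-walk state equation.
        particles = [w + theta_w * rng.gauss(0.0, 1.0) for w in particles]
        # Log-weights from the Gaussian observation density.
        logw = [-0.5 * ((y - w) / obs_sd) ** 2
                - math.log(obs_sd * math.sqrt(2.0 * math.pi)) for w in particles]
        m = max(logw)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - m) for lw in logw]
        loglik += m + math.log(sum(weights) / n_particles)
        # Multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik


def pmmh(ys, n_iter, n_particles=100, obs_sd=0.5, step=0.1, seed=1):
    """Particle marginal Metropolis-Hastings over theta_w with an Exp(1) prior."""
    rng = random.Random(seed)
    theta = 0.5
    ll = particle_loglik(ys, theta, obs_sd, n_particles, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)  # symmetric random-walk proposal
        if prop > 0.0:  # proposals outside the prior support are rejected
            ll_prop = particle_loglik(ys, prop, obs_sd, n_particles, rng)
            # Log acceptance ratio: likelihood estimate plus Exp(1) log-prior (-theta).
            log_alpha = (ll_prop - prop) - (ll - theta)
            if rng.random() < math.exp(min(0.0, log_alpha)):
                theta, ll = prop, ll_prop
        chain.append(theta)
    return chain


# Synthetic data from the toy model with a hypothetical true theta_w = 0.3.
data_rng = random.Random(0)
w, ys = 0.0, []
for _ in range(20):
    w += 0.3 * data_rng.gauss(0.0, 1.0)
    ys.append(w + 0.5 * data_rng.gauss(0.0, 1.0))

chain = pmmh(ys, n_iter=100)
```

The key design point is that the particle filter supplies an unbiased estimate of the marginal likelihood, so the Metropolis-Hastings step still targets the exact posterior of the parameter even though the likelihood is never evaluated in closed form.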
The model parameters $\eta$, $w_0$, $\sigma$, and $p_c$ or $p_t$ are given wide uninformative priors due to the uncertainty about the ongoing pandemic and disparities in reporting data. We model the infection period as a truncated Gaussian distribution with mean of 5.058 days, standard deviation of 1.51, lower bound of 2.228 days and upper bound of 11.8 days, following Lauer et al. 40 . The prior for the latent period $E_t$ is obtained from the study of Moghadas et al. 28 , and is modeled as a gamma distribution with shape and scale parameters 1.058 and 2.174, respectively 41 . For the initial proportions of the population in states $X_0$ we chose a Dirichlet distribution, while constraining the mean of $R_0$ to be $\mathcal{N}(0.5, 0.25^2)$, and let the means of the other compartments be equal. By this choice, we ensure that the condition $S_t + E_t + I_t + R_t = N$ or $S_t + I_t + R_t = N$ is satisfied in the respective SEIR or SIR model. Thus, the sum over all compartments in the epidemiological model at each time step is the same as the total population $N$. Lastly, we ran PMCMC sampling with 5,000 particles and obtained 50,000 samples from the posterior after a burn-in period of 5,000 iterations.
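The two informative priors described above can be sampled with the Python standard library, as in the hedged sketch below. The rejection sampler and the function name `sample_truncated_gauss` are assumptions made for illustration. The standard deviation is taken as 1.519 days, the value quoted in the Model parameters section (this section quotes 1.51), and `random.gammavariate` takes the shape and scale parameters in that order.

```python
import random


def sample_truncated_gauss(mean, sd, lower, upper, rng):
    """Draw from a Gaussian restricted to [lower, upper] by rejection."""
    while True:
        x = rng.gauss(mean, sd)
        if lower <= x <= upper:
            return x


rng = random.Random(42)

# Infection period prior (days): truncated Gaussian with the quoted bounds.
infectious_days = [
    sample_truncated_gauss(5.058, 1.519, 2.228, 11.8, rng) for _ in range(1000)
]

# Latent period prior (days): gamma with shape 1.058 and scale 2.174.
latent_days = [rng.gammavariate(1.058, 2.174) for _ in range(1000)]
```

Rejection sampling is efficient here because the bounds lie far out in the tails of the untruncated Gaussian, so few draws are discarded.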
Choice of density for the observational model. A Poisson or a Gaussian approximation can be used for the density h(Y 1:T |X 0:T , p 0:T , θ Y ) of the observational model. Pilot PMCMC runs demonstrate similar effective sample sizes for the Poisson and Gaussian approximations, but higher number of particles and therefore higher computational budget are required for the Poisson approximation. For this reason, a Gaussian approximation is preferred.
Overview of data and of experimental setup. The data used in our experiments are based on daily case counts from March 1, 2020, through August 31, 2020, obtained from the New York Times COVID data repository 13 . In our analysis, we use daily reported case counts and aggregate them on a weekly basis for computational considerations. For one iteration of the PMCMC method, each particle in the ensemble requires the numerical approximation of a system of non-linear ordinary differential equations comprised of T time steps. This computational cost becomes infeasible in the case of daily case counts due to the increased number of particles required for PMCMC sampling.
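The weekly aggregation step can be sketched as follows. This is an illustrative helper, not the preprocessing code of this study: the function name `aggregate_weekly` is hypothetical, and dropping an incomplete final week is an assumption, since the text does not say how leftover days are handled.

```python
def aggregate_weekly(daily_counts):
    """Sum daily counts into consecutive 7-day blocks.

    Assumption for this sketch: a trailing incomplete week is dropped.
    """
    n_weeks = len(daily_counts) // 7
    return [sum(daily_counts[7 * k: 7 * (k + 1)]) for k in range(n_weeks)]


# Hypothetical example: two weeks of daily counts 1, 2, ..., 14.
weekly = aggregate_weekly(list(range(1, 15)))
```

March 1 through August 31, 2020 spans 184 days, so this scheme yields 26 full weeks, which shortens each numerical integration of the differential equations from 184 time steps to 26.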
For the implementation of our model and for PMCMC sampling, we use the Bayesian modelling software libBi 42 and the R packages rbi and rbi.helpers 43,44 . Our models, data and code for reproducing our results can be found at https://github.com/aspannaus/Covid-model.

Data availability
The datasets analyzed during the current study are available in the Covid-Model GitHub repository https://github.com/aspannaus/Covid-model.

Appendices
SEIR model definition
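The SEIR epidemiological model with time-varying transmission rate $\beta(t)$ is represented by the system of differential equations

$$
\begin{aligned}
\frac{dS(t)}{dt} &= -\beta(t)\,\frac{S(t)\,I(t)}{N},\\
\frac{dE(t)}{dt} &= \beta(t)\,\frac{S(t)\,I(t)}{N} - \kappa E(t),\\
\frac{dI(t)}{dt} &= \kappa E(t) - \gamma I(t),\\
\frac{dR(t)}{dt} &= \gamma I(t),\\
dw(t) &= \mu(w(t), \theta_w)\,dt + \sigma(w(t), \theta_w)\,dB(t),\\
w(t) &= g(\beta(t)).
\end{aligned}
$$

In this study, we take $\mu = 0$ and $\sigma(w(t), \theta_w) = \theta_w$, thus assuming that the transmission rate on any day is likely to be the same as on the previous day, and set $g(\cdot) = \log(\cdot)$. With these choices, the resulting path $w(t)$, defined by the stochastic differential equation $dw(t) = \theta_w\,dB(t)$, is a Brownian motion. Allowing $w(t)$ to vary in time defines the effective transmission rate $\beta(t)$, which controls the extent to which a disease spreads between individuals within a population.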

SEIR model results
In this appendix, we present results on parameter uncertainty based on SEIR modelling, while the main manuscript presents results on parameter uncertainty based on SIR modelling. We observe that the inclusion of the latent infected compartment in the epidemiological model has a marked impact on the estimates of the time-varying reporting and transmission rates (Figs. 6, 7 and 8). The effect on the latter follows from recalling that the transmissibility is the product of the probability of passing the disease to another individual and the number of interactions with all individuals. If there is in fact a latent phase in the course of COVID-19, then it follows that individuals in the exposed bin could be mixing with the general population, potentially passing on the disease as their infection transitions from latent to active. In the case of New York data, the posterior mean estimate of the constant reporting rate of the SEIR model has narrower credible intervals (straight orange line in Fig. 8a) than the posterior mean estimate of the constant reporting rate of the SIR model (Fig. 4a). However, the credible intervals for the former seem spurious, since the posterior mean estimate of the time-varying reporting rate of our SEIR model varies substantially with time (blue line in Fig. 8a). In fact, the posterior predictive means of the number of cases are closer to the respective observed number of cases when employing a time-varying (rather than constant) reporting rate in the SEIR model (Fig. 5b). So, the reduced predictive capacity of the SEIR model with constant reporting rate (in comparison to our SEIR model with time-varying reporting rate) implies further that the estimated constant reporting rate in Fig. 8a is not accurate.